SUMMARY


A  key  goal  in  planning/designing  future  WWDSA/DSN/DCS  architectures/ 
systems  is  to  provide  their  subscribers,  under  varying  environments/ 
stress  levels,  an  acceptably  high  probability  of  reliably  completing  their 
required  information  transfers  within  a  time  required  for  mission 
accomplishment.  In  support  of  this  goal,  a  measure  of  effectiveness  (M.O.E.) 
is  formulated  as  the  probability  given  an  environment/stress  level  that  a 
subscriber  to  subscriber  information  transfer  is  completed,  reliably,  within 
an  acceptable  delay.  The  M.O.E.  is  described  for  a  general  case  but  is  only 
applied  to  two  special  cases  assuming:  a  digital  path;  no  queueing  delays; 
all  failure  events  are  Poisson  distributed;  and  all  times  to  recover  from 
failures are exponentially distributed. Furthermore, failure events include
long term outages (> or = 60 seconds) as well as short term (< 60 seconds)
degradations in which the bit error rate becomes worse than an acceptable
threshold for the mode used (voice or data).

In the first case studied, which is similar to AUTOVON overseas,
probabilities  are  explicitly  modelled  for  conditions  where  successful 
information  transfers  are  prevented  by  failures  (long/short)  or  delayed  by  the 
time  needed  to  restore  the  path.  Mean  rates  of  short  term  failure  and 
"repair"  are  developed  as  a  function  of  bit  rate  (B)  and  average  error 
probability  (P).  The  parameters  of  the  M.O.E.  are  shown  to  be  long  and  short 
term  average  path  "failure"  and  "repair"  rates,  as  well  as  the  average 
information  transfer  duration  (1/M).  (Information  transfer  durations  are 
assumed  to  be  exponentially  distributed.)  The  M.O.E.  parameters  are  changed 
to  emulate  the  effects  of  increasing  stress,  and  the  endurability  enhancements 
of faster acting tech control. In the context of DCS-tactical paths, the
additional  effects  of  changing  P  and  B  are  also  evaluated.  Relative 
robustness  of  higher  B  for  voice  under  degraded  P  is  shown,  and  the  need  for 
higher  data  reliability.  ITSTEC  is  shown  to  be  a  substantial  improvement,  but 
effectiveness  will  still  be  severely  limited  by  short  term  reliability,  for 
reliability  averaged  over  the  information  transfer  (average  duration,  1/M)  is 
shown to be asymptotic to M/(M+FL+FS), where FL and FS are the mean
long and short term failure rates.

The  search  for  a  more  endurable  system  next  considers  the  second  case, 
dynamic  "repair",  of  the  form  where  information  transfer  attempts  which  fail 
are  repeated,  until  reliably  completed.  Here,  reliability  (R)  (neglecting 
undetected  errors)  is  asymptotic  to  1.  Analysis  of  the  case  with  long  and 
short  term  parameters  and  variable  information  durations  requires  conversion 
from  s  to  Z  transforms  for  explicit  time  domain  solution,  and  is  not  presented 
here.  However,  evaluation  of  the  M.O.E.  for  this  case  is  mathematically 
equivalent  to  finding  the  impulse  response  of  an  infinite  impulse  response 
(IIR) digital filter. Instead, the M.O.E. is derived for a constant
information  transfer  duration  with  short  term  failures  and  "repairs",  with  and 
without  dynamic  "repairs".  As  the  delay  approaches  infinity,  Pr[delay  <  t]  x 
R  approaches  1,  with  dynamic  repair,  and  approaches  R  without  it.  Performance 
variation  as  a  function  of  error  probability,  P,  and  measurement  interval  are 
studied,  for  data  and  voice,  with  and  without  dynamic  repair.  For  voice  the 
useful  range  of  P  is  shown  to  be  the  same  with  and  without  repeats. 

Finally, assuming short term Poisson failures and exponentially
distributed repair times, and an n packet message of constant duration, an
expression for the M.O.E. (Pr[delay < or = t] x R) is derived and physically
interpreted.  The  response  time  improvements  of  packetizing  data  are  also 
shown. 

In  view  of  the  poor  reliability  of  no  dynamic  repair,  and  the  intolerable 
delay  of  dynamic  repair  via  repeat  for  voice,  dynamic  repair  via  fast 
switching  deserves  study  and  consideration. 


I.  INTRODUCTION 


The  purposes  of  this  paper  are  to  present  a  conceptual  measure  of 
effectiveness (M.O.E.) and to demonstrate that the M.O.E. can provide (a) a
framework  for  imbedding  many  currently  incoherent  architectural  bits  and 
pieces,  and  (b)  a  focal  point  for  measuring  the  contribution  of  a  piece  and 
viewing  it  in  relationship  to  that  of  other  pieces.  This  effort  was  done  in 
support  of  the  DCS  Plan  FY  90-95,  and  is  intended  to  contribute  to  improved 
planning of future DCS and WWDSA architectures.

II.  BACKGROUND 

In support of the effort of recommending a future DCS architecture,
current relevant architectural studies were reviewed and summarized,
including: WWMCCS [1,2,3], WWSVA [4], WWDSA [5,6,7], DSN [8,9], ITSEC [10],
IASA [11,12], EISN [13], and MILSATCOM/DSCS [14,15,16,17]. In attempting to
summarize  individually  interesting  features  of  each  system,  one  rapidly  loses 
sight  of  the  forest  for  the  trees.  Therefore,  it  was  decided  to  approach  the 
problem  of  summarizing  key  architectural  features  by  first  constructing  a 
picture  of  the  forest  (the  M.O.E.)  from  its  trees  (elements  of  the  M.O.E.), 
and  then  relating  the  architectural  features  observed  to  the  M.O.E.  elements. 

III. TECHNICAL DISCUSSION


An  architectural  description  of  a  system  is  viewed  as  a  description 
sufficient  to  allow  one  to  trace  representative  subscriber-to-subscriber(s) 
information  transfer  (i.t.)  paths  across  subsystem  and  management  boundaries. 

It  should  also  permit  an  evaluation  in  some  sense  of  the  goodness  of  the  i.t. 

Consider  that  the  fundamental  purpose  of  a  communications  system  is  to 
provide transfer of information subscriber-to-subscriber(s), within a
timeliness consistent with subscriber mission accomplishment. For if
timeliness were not the key requirement, slower physical transport of the
information  could  be  employed. 

At  this  point,  a  measure  of  goodness  for  a  representative  subscriber 
pair  (or  conference  group)  could  be  the  probability  that  the  information  is 
transferred between (among) them within an acceptable delay. Such a
probability  could  be  estimated  from  a  sample  of  representative  subscriber 
pairs  (groups),  and  over  a  suitable  span  of  exogenous  stress 
environments/stress  levels.  Other  important  factors  such  as  quality/accuracy 
and  security  could  be  implicitly  included  in  the  sense  that  a  sample  is  not 
scored  as  successful  unless  thresholds  required  for  these  factors  are  exceeded. 

We obtain a quantitative expression for the probability that the
subscriber-to-subscriber information is reliably transferred within an
acceptable delay, D, given an environment, Ei, explicitly modeling the
probabilities that successful information transfers are either prevented by
failures/accuracy degradations or delayed by time to restore. At this stage,
we have omitted queueing effects in the explicit evaluations, but do indicate
how such effects could be included using the distribution function of queueing
delay, given that the path is available. In the following derivation, to
simplify notation the given environment will be implicitly contained in the
probability values assigned.

The initial model includes two mutually exclusive compound random
events, namely,

$$[\,a + \bar{a}\,(re)\,]\,(W)\,T\,r \tag{1}$$


Here, a is the event that the path is available at the instant of demand, ā is
the complement of a, re is the event that the path is restored/repaired within
some time delay given it was not available (ā), r is the event that the path
is reliable over interval T + W given it was available, where T is the
information transfer interval and W is the waiting time on queue.

To  obtain  the  probability  of  reliably  completing  the  information 
transfer  within  acceptable  delay  D,  one  can  find  the  probability  density 
function (pdf) of the total delay for each of the compound events and
integrate  the  pdf  from  0  to  D. 




One can partition the availability, reliability, and repair events into
long (L) and short (S) term components, i.e., > 1 minute and < or = to 1
minute, in accord with current engineering practice. This partitioning
results in the following four compound events leading towards potential
success:

$$[\,a_L + \bar{a}_L\,(re_L)\,]\,[\,a_S + \bar{a}_S\,(re_S)\,]\,W\,T\,(r_L)(r_S). \tag{2}$$

Dynamic  repairs,  i.e.,  repairs  taking  place  during  the  information  transfer 
interval,  are  discussed  later  in  this  TN.  Two  forms  are:  (a)  Automatic 
Repeat  Request  (ARQ),  and  (b)  use  of  redundancy  with  fast  switching  to  enhance 
reliability/availability by sensing degradation on the path and switching in a
replacement  element  prior  to  potential  path  failure.  Only  the  ARQ  case  is 
discussed  in  this  paper. 

By way of physical explanation of a sample term from equation (2),
consider the final term, which is

$$\bar{a}_L\,(re_L)\;\bar{a}_S\,(re_S)\;W\,T\,(r_L)(r_S)$$

This is the event that the path is not long term available (āL), it is
repaired (reL) thereby becoming available, but is not available short term
(āS), is repaired (reS), being now available waits on a queue a time (W), is
of duration (T), and stays reliable long term (rL), and given that condition
stays reliable short term (rS). The resulting delay incurred in completing
such an event is reL + reS + W + T, where these are random variables, and by
assumption are independent.

Since the total time delay to completion of the information transfer is
the sum of a number of (by assumption) independent random variables, the
probability density function (pdf) of the total time delay to completion is
the convolution of the pdf's of the individual random variables. One can then
obtain the corresponding probability that the total delay to completion is
< or = D, i.e., the cumulative distribution function (CDF), by integrating the pdf
from time 0 to D. Finally, multiplying the CDF by the probability that the
path does not fail over the information transfer interval T will result in a
joint probability function which measures the probability that the information
is completely transferred within time D, and the path does not fail over the
information transfer interval T. We will let the queue wait W=0 in the
ensuing analysis, in order to keep it more analytically tractable. We note
that given a density function for the queue waiting time, and if independence
still holds, queueing effects could be included.

Consider next the term-by-term events and probabilities involved in the
delay computation, omitting queue delays and excluding for now the reliability
terms which are a common multiplier. The first such compound event in
equation (2) is (aL)aS(T). Assuming the duration T is exponentially
distributed with a mean of 1/M, the probability density function for this
event is (AL)(AS)[M exp(-MT) dT]. Here AL is the probability of event aL, or
long term (> or = 1 minute) availability. AL is estimated as 1 - (sum total of
outages > or = to 1 minute duration divided by the total measurement
interval). (Under a stress environment such outages could include those
produced by enemy attack, hence survivability is implicitly included.) AS,
the probability of event aS, is the short term (< 1 minute) point
availability, estimated as 1 - (sum total of outages < 1 minute duration divided
by a total measurement interval which excludes outages whose duration is
> or = 1 minute). Thus AS is a measure of the probability of obtaining
acceptable quality (i.e., for voice) or accuracy (i.e., for data) given the
path is long term available. Integrating the pdf from T=0 to D produces a
contribution to the CDF of (AL)(AS)[1 - exp(-MD)].

The second compound event in equation (2) is āL(reL)aS(T), with a pdf of
(1-AL) U(t-1/60)[ML exp(-ML(t-1/60)) dt] AS [M exp(-MT) dT]. Here U(t-1/60) is
a unit step function, value 0 for t < 1/60 hour, value 1 for t > or = 1/60
hour. ML is the average repair/restoral rate of outages > or = 1 minute,
i.e., ML is 1/MTTR, where MTTR is the mean time to repair/restore the path in
hours. The probability that the delay t+T is < or = D, found by integrating
the joint pdf over the region of the t,T plane bounded by t=1/60, T=0 and the
line t+T=D, is

$$U(D - 1/60)\,(1 - A_L)\,A_S\,\frac{1}{M - M_L}\Big[M\big(1 - e^{-M_L(D - 1/60)}\big) - M_L\big(1 - e^{-M(D - 1/60)}\big)\Big].$$


The third compound event in equation (2) is aL(āS)reS(T). The resulting
contribution to the CDF of delay is:

$$A_L\,(1 - A_S)\,\frac{1}{M - M_S}\Big[M\big(1 - e^{-M_S D}\big) - M_S\big(1 - e^{-M D}\big)\Big],$$

where an exponential pdf for short term repairs/restorals, mean rate MS, has
been used. Strictly speaking, the conditional pdf for short term repairs < 1
minute should be of the form

$$\frac{b\,e^{-bt}}{1 - e^{-b/60}} \quad \text{for } 0 < t < 1/60 \text{ hour}, \qquad 0 \quad \text{for } t \ge 1/60 \text{ hour},$$

which produces an average short term repair time (in
hours) of 1/b - (1/60) exp[-b/60]/(1 - exp[-b/60]). For the average short term
repair times subsequently derived and used in this report, which are at most a
few seconds, exp(-b/60) << 1/b and 1. Hence, using the density function
MS exp(-MSt), 0 < or = t < ∞, for values of 1/MS up to a few seconds has negligible
effect on the results.
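
As a rough numerical check of this approximation, the sketch below compares the mean of the truncated exponential with the plain exponential mean 1/b. The function name and the sample value b = 1800/hour (a 2 second mean repair time) are illustrative assumptions, not values taken from this report.

```python
import math

def truncated_mean_repair_hours(b, cutoff=1.0 / 60.0):
    """Mean (hours) of an exponential with rate b per hour, truncated at cutoff hours."""
    tail = math.exp(-b * cutoff)
    return 1.0 / b - cutoff * tail / (1.0 - tail)

# Hypothetical example: mean repair time of 2 seconds, i.e. b = 1800 per hour.
b = 1800.0
plain_mean = 1.0 / b
truncated_mean = truncated_mean_repair_hours(b)
relative_difference = abs(plain_mean - truncated_mean) / plain_mean
print(plain_mean, truncated_mean, relative_difference)
```

For repair times of a few seconds the relative difference is many orders of magnitude below any parameter uncertainty, which is the sense in which the untruncated density is adequate.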

The fourth compound event in equation (2), āL(reL)āS(reS)T, produces a CDF of

$$\frac{U(D - 1/60)\,(1 - A_L)(1 - A_S)}{(M_S - M_L)(M - M_L)(M - M_S)}\Big[M\Big((M - M_S)\,M_S\big(1 - e^{-M_L(D - 1/60)}\big) - (M - M_L)\,M_L\big(1 - e^{-M_S(D - 1/60)}\big)\Big) + M_L M_S (M_S - M_L)\big(1 - e^{-M(D - 1/60)}\big)\Big].$$


Adding the four contributions to the CDF derived thus far produces the
desired probability of delay to completion of the information transfer. At
D = ∞, the CDF = 1. For a given information transfer of duration T, the
probability of compound event rLrS is assumed to be exp(-FLT) x exp(-FST).
Here, long term failure rate FL is (1/mean time to fail) given the path was up
at the start of any sample interval and counts only failures > 1 minute.
Short term failure rate FS is similar but counts as failures only those
involving measurement intervals < or = 1 minute. Since the probability of
occurrence of T lying between T and T+dT is M exp(-MT) dT, the average
probability of not failing over interval T is

$$\int_0^\infty M e^{-MT}\,e^{-(F_L + F_S)T}\,dT = \frac{M}{M + F_L + F_S}.$$

In  most  of  the  following  derivations  and  curves  we  shall  employ  this  average 
reliability  value. 
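
The four CDF contributions and the averaged reliability can be combined numerically. The following is a minimal sketch, assuming rates in events per hour and delays in hours. The function names are hypothetical, and the general formula for the CDF of a sum of independent exponentials with distinct rates is used in place of the closed forms written out in the text (the two agree term by term when the rates are distinct). The sample values M = 10/hour, FL = 0.0202/hour, ML = 2/hour, FS = 3.7 x 10^-97/hour and MS = 1.65 x 10^5/hour are figures quoted elsewhere in this paper.

```python
import math

def hypoexp_cdf(x, rates):
    """CDF at x of a sum of independent exponentials with distinct rates."""
    if x <= 0.0:
        return 0.0
    total = 0.0
    for i, ri in enumerate(rates):
        weight = 1.0
        for j, rj in enumerate(rates):
            if j != i:
                weight *= rj / (rj - ri)
        total += weight * -math.expm1(-ri * x)  # weight * (1 - exp(-ri * x))
    return total

def moe_no_dynamic_repair(D, M, FL, ML, FS, MS, step=1.0 / 60.0):
    """Pr[delay <= D] x R with no queueing and no repair during the transfer."""
    AL = 1.0 / (1.0 + FL / ML)  # long term availability
    AS = 1.0 / (1.0 + FS / MS)  # short term availability
    cdf = (
        AL * AS * hypoexp_cdf(D, [M])                                    # path up
        + (1.0 - AL) * AS * hypoexp_cdf(D - step, [ML, M])               # long term repair
        + AL * (1.0 - AS) * hypoexp_cdf(D, [MS, M])                      # short term repair
        + (1.0 - AL) * (1.0 - AS) * hypoexp_cdf(D - step, [ML, MS, M])   # both repairs
    )
    R = M / (M + FL + FS)  # reliability averaged over exponential T
    return cdf * R

peacetime = dict(M=10.0, FL=0.0202, ML=2.0, FS=3.7e-97, MS=1.65e5)
for D in (0.05, 0.1, 0.2, 0.5):
    print(D, moe_no_dynamic_repair(D, **peacetime))
```

As D grows the returned value levels off at R = M/(M+FL+FS) rather than at 1, which is the asymptote discussed in the text.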

Figure 1 shows some representative plots of the product of the probability
that the total delay is < or = D, and the probability (averaged over T) that the
path does not fail over its information transfer duration. The abscissa shows
time delay in hours. The ordinate is the product of the probability that the
information transfer is completed in a time < or = D and is reliable over its
duration. The top curve is perfect performance, and is Pr[d < or = D] =
1 - exp(-MD), the CDF of an exponentially distributed information transfer, mean
duration of 1/M. This curve is a very useful benchmark against which to judge
system performance. Noting that AL = 1/(1 + FL/ML) and AS = 1/(1 + FS/MS), and
average reliability, R, is M/(M+FL+FS), perfect performance occurs at
FL=FS=0. As F approaches ∞, i.e., zero mean time to failure (MTTF), the
worst possible performance occurs, Pr[d < or = D] = 0, which is the abscissa in
Figure 1. Another noteworthy point is that, assuming highest precedence
subscribers suffer no queueing delay, elimination of queueing delays in the
current model makes it representative of delays to completion seen by highest
precedence subscribers (for systems which do not repair/restore during the
information transfer interval).

Next, using both experience and calculable theoretical models, values for
FL, ML, FS and MS are derived. These values can be associated with various
environments and system designs. Their impact on performance is examined by
plotting the Pr[d < or = D] x R. Let us start with a model for a current overseas
digital AUTOVON DCS. Consider first the long term parameters FL and ML.
Assume an MTTR = 1/ML of 0.5 hour, counting only reported outages > or = 1/60
hour. Further, assuming AL = .99 produces FL = 0.0202/hour. Consider next the
short term failure rate parameter FS. By definition, a short term failure
occurs when, given the previous interval was acceptable, the number of error
bits observed over a short (i.e., < or = 1 minute) interval t equals or
exceeds an unacceptable threshold. Let PU be the unacceptable error
probability and let P be an error probability as averaged over 1 minute
intervals while the path is in use.

We further assume, for this initial model, a purely random channel in
which the number of error bits arriving in any interval t is Poisson
distributed [18]. The probability of no short term failures over interval t,
i.e., exp(-FSt), is also the Poisson CDF,

$$\sum_{j=0}^{(P_U)Bt - 1} \frac{e^{-PBt}\,(PBt)^j}{j!} \tag{4}$$

where B is the bit rate, PU is the unacceptable error probability threshold,
and (PU)Bt-1 is the maximum number of error bits allowed in interval t if a no
failure event occurs. For the voice mode, we assume PU = 10^-2, and t = 5
seconds, hence, (PU)Bt-1 = 119 error bits at 2.4 kb/s or 799 error bits at 16
kb/s.

The Poisson cumulative distribution function is tabulated in reference
[19] as a Chi squared distribution with V degrees of freedom, Q(X^2|V), where
X^2 corresponds to 2PBt and V corresponds to 2(PU)Bt in our problem. For V >
100, Q(X^2|V) (which we associate with the probability of acceptable
reception) is approximately equal to Q(X1) = 1-P(X1), with

$$X_1 = 2\,(PBt)^{1/2} - \big(4\,(P_U)Bt - 1\big)^{1/2} \tag{5}$$

where P(X1) (the short term probability of unacceptable reception) is the
cumulative distribution function of the normal distribution with mean 0,
variance 1. Using the normal approximation, P = 10^-5 and the
aforementioned parameter values, we find P(X1) is 10^-99.29 at B = 2.4 kb/s,
and 10^-658.96 at B = 16 kb/s. The conditional probability that the path
does not fail over short interval t, i.e., exp(-FSt), is the same as the short
term probability of acceptable reception, 1-P(X1). Hence, FS = -(1/5) Ln
(1-P(X1)). Since -Ln(1-P(X1)) is nearly equal to P(X1) for P(X1) << 1, FS
(per 5 seconds) is about P(X1)/5 sec; hence the failure rate per hour values
of FS are 720 P(X1), or 3.7 x 10^-97 at 2.4 kb/s and 7.9 x 10^-657 at 16
kb/s.

To compute MS, the conditional mean short term repair rate, first consider
the compound event: acceptable at time 0, unacceptable for the next j
intervals (duration jt), acceptable over interval (j+1). Using P(X1) =
probability of being unacceptable over interval t, then the compound event has
a joint probability (1-P(X1)) P(X1)^j (1-P(X1)). Dividing this joint
probability by the joint probability of being acceptable at time 0 and
acceptable over interval j+1, produces the conditional probability of being
unacceptable (i.e., undergoing "repair") for a continuous duration jt as
P(X1)^j. Hence, P(X1)^j = exp(-MSjt) = exp(-MSt)^j and MS can be computed
for the voice case under discussion, as MS = -(1/5) Ln (P(X1)). Results using
t = 5 seconds and P = 10^-5 are MS = 1.65 x 10^5/hour at 2.4 kb/s and 1.092 x
10^6/hour at 16 kb/s.
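
The short term rates can be reproduced approximately from equation (5). The sketch below is an illustration under stated assumptions: the function name is hypothetical, the normal CDF is evaluated with the standard library `math.erfc`, and rates are converted to events per hour. In double precision the tail probability underflows to zero for the higher bit rates at low P, so only the 2.4 kb/s cases are evaluated here; a log-domain tail approximation would be needed beyond that.

```python
import math

def short_term_rates(P, B, PU=1e-2, t=5.0):
    """Return (X1, FS per hour, MS per hour) from the normal approximation.

    P  : average bit error probability
    B  : bit rate in bits per second
    PU : unacceptable error probability threshold
    t  : measurement interval in seconds
    """
    x1 = 2.0 * math.sqrt(P * B * t) - math.sqrt(4.0 * PU * B * t - 1.0)
    p_fail = 0.5 * math.erfc(-x1 / math.sqrt(2.0))  # P(X1), standard normal CDF
    intervals_per_hour = 3600.0 / t
    fs = -intervals_per_hour * math.log1p(-p_fail)  # FS = -(1/t) Ln(1 - P(X1))
    ms = -intervals_per_hour * math.log(p_fail)     # MS = -(1/t) Ln(P(X1))
    return x1, fs, ms

for P in (1e-5, 1e-2):
    x1, fs, ms = short_term_rates(P, 2400.0)
    print(P, x1, fs, ms)
```

With a different numerical evaluation of the extreme normal tail, the low-P values can differ in the last digits from those quoted in the text; the values at P = PU are insensitive to this.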

Since MS = -(1/t) Ln (P(X1)) implies P(X1) = exp(-MSt) and FS = -(1/t) Ln
(1-P(X1)) implies (1-P(X1)) = exp(-FSt), then exp(-MSt) + exp(-FSt) = 1, which says
that over interval t the probability that the channel is in a failed or
non-failed state is 1.

AS = 1/(FS/MS + 1) is nearly 1 for B > or = 2.4 kb/s when we use the
previously derived values of FS and MS at P = 10^-5.

The curve in Figure 1, labeled current overseas AUTOVON, peacetime
environment, voice (P = 10^-5 at 2.4 kb/s) was plotted using the parameters
just derived, i.e., AL = .99, ML = 2/hour, FL = .0202/hour, FS = 7.2 x 10^-97,
MS = 1.64 x 10^5/hour, AS = 1. The Pr[d < or = D] x R is close to ideal
performance. The next curve, labeled current overseas AUTOVON, wartime
environment, uses FL = 2.0202, ML = 2/hour, AL = .4975, and the same short
term parameters. The next curve, labeled mid-term overseas DSN, wartime
environment, uses the same assumed wartime FL (and the same short term
parameters) but the long term mean time to restore/repair has been reduced
from 30 to 3 minutes, producing ML = 20/hour, AL = .908. This last curve is
intended to show the endurability enhancement of a faster acting tech control
such as ITSTEC.
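
The long term availabilities quoted for the three curves follow directly from AL = 1/(1 + FL/ML). A small sketch (the function name and dictionary labels are illustrative only):

```python
def availability(failure_rate, repair_rate):
    """Steady state availability 1/(1 + F/M) for failure rate F and repair rate M."""
    return 1.0 / (1.0 + failure_rate / repair_rate)

# (FL per hour, ML per hour) for the three curves described in the text.
scenarios = {
    "AUTOVON, peacetime": (0.0202, 2.0),
    "AUTOVON, wartime": (2.0202, 2.0),
    "mid-term DSN, wartime": (2.0202, 20.0),
}
long_term_availability = {
    name: availability(fl, ml) for name, (fl, ml) in scenarios.items()
}
for name, al in long_term_availability.items():
    print(f"{name}: AL = {al:.4f}")
```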

Next,  noting  that  the  1  minute  error  probability  P  could  increase 
substantially,  particularly  where  tactical  circuits  might  be  part  of  the  path, 
Table  I  shows  how  FS  and  MS  vary  with  P  at  B  =  2.4,  16  and  also  32  kb/s. 

TABLE I

VARIATION OF SHORT TERM VOICE MEAN FAILURE RATES (FS)
AND REPAIR RATES (MS) WITH ERROR PROBABILITY (P) AND BIT RATE (B)
(FS and MS in events per hour)

          B = 2.4 kb/s                   B = 16 kb/s                    B = 32 kb/s
P         FS              MS             FS              MS             FS           MS
10^-5     3.69 x 10^-97   1.65 x 10^5    7.89 x 10^-657  1.09 x 10^6    10^-1306     2.17 x 10^6
10^-4     7.7 x 10^-84    1.43 x 10^5    2.44 x 10^-564  9.39 x 10^5    [illegible]  1.88 x 10^6
10^-3     5.2 x 10^-48    8.31 x 10^4    1.28 x 10^-324  5.42 x 10^5    10^-654      1.09 x 10^6
10^-2     512.3           486.1          504.2           494.0          502.7        495.5

A large increase in FS occurs somewhere between P = 10^-2 and 10^-3 at all of
the bit rates, i.e., a threshold effect is evident. Recalling that averaged
over T, R = M/(M+FL+FS), one can define an onset of the threshold effect, by
setting FL = 0 and choosing that value of FS which produces M/(M+FL+FS) = 0.99.
Using M = 10/hour, FS at threshold = 0.101010, and since FS = -720 Ln Q(X1),
Q(X1) is .999859718, where X1 is about -3.632679. Using this X1 value in
equation (5) with PU = 10^-2, t = 5 seconds, and choosing a bit rate B gives
a value of P corresponding to FS at the onset of threshold. The threshold
values of P are: 9.11 x 10^-3 at 32 kb/s; 8.75 x 10^-3 at 16 kb/s; and 6.94
x 10^-3 at 2.4 kb/s. One can get a feeling for relative robustness of the
various bit rates, by plotting Pr[d < or = D] x R for each bit rate as P degrades
from 10^-5 to 6.94 x 10^-3 to 8.75 x 10^-3 to 9.11 x 10^-3. Some of these
curves are shown in Figure 1, as indicated, and also include the effects of
the previously assumed wartime long term failure rate FL = 2.0202/hour and
ITSTEC long term repair rate = 20/hour.
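
The threshold values of P can be recovered by inverting equation (5). The sketch below is illustrative: the function name and keyword names are assumptions, and the inverse normal CDF comes from the Python standard library (`statistics.NormalDist`, Python 3.8 or later).

```python
import math
from statistics import NormalDist

def threshold_error_probability(B, M=10.0, R_target=0.99, PU=1e-2, t=5.0):
    """P at the onset of the threshold effect for bit rate B (bits per second)."""
    fs = M * (1.0 / R_target - 1.0)            # FS per hour giving M/(M+FS) = R_target
    intervals_per_hour = 3600.0 / t
    p_fail = -math.expm1(-fs / intervals_per_hour)  # P(X1) = 1 - Q(X1)
    x1 = NormalDist().inv_cdf(p_fail)
    # Invert X1 = 2 sqrt(P B t) - sqrt(4 PU B t - 1) for P.
    root = (x1 + math.sqrt(4.0 * PU * B * t - 1.0)) / 2.0
    return root * root / (B * t)

for B in (2400.0, 16000.0, 32000.0):
    print(B, threshold_error_probability(B))
```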

Just above the lowest curve is one labeled computer data 2.4 kb/s at
P = 10^-5. Parameters used were FS = BP = 86.4/hour and MS = -B Ln(1-exp(-P))
= 9.947 x 10^7/hour. These parameters were derived using a 1 bit measurement
interval in equation (4), hence t = 1/B, using an unacceptable error
probability threshold of 1 bit in error. Note that 1/FS = 1/BP is the mean
error free interval.
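
The data-mode parameters follow from a one bit measurement interval. A brief sketch, with a hypothetical function name and the bit rate converted to bits per hour so that both rates come out per hour:

```python
import math

def data_mode_rates(P, bits_per_second):
    """Return (FS, MS) per hour for a 1 bit measurement interval (t = 1/B)."""
    bits_per_hour = bits_per_second * 3600.0
    fs = bits_per_hour * P                                   # FS = B P
    ms = -bits_per_hour * math.log(-math.expm1(-P))          # MS = -B Ln(1 - exp(-P))
    return fs, ms

fs_data, ms_data = data_mode_rates(1e-5, 2400.0)
print(fs_data, ms_data)
```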

The  performance  curves  of  Figure  1  for  digital  voice  confirm  the  decisions 
of  TRI-TAC  to  operate  at  higher  bit  rates  such  as  32  or  16  kb/s,  from  the 
standpoint of relative robustness as average error rate degrades. For
similar,  i.e.,  quality  reasons,  the  WWSVA  study  recommended  retention  of  a 
higher  quality  16  kb/s  rate.  The  performance  curve  for  computer  data  reflects 
the  need  for  more  reliable  network  transmission,  which  has  led  to  development 
and  operation  of  a  separate  network  for  data,  as  exemplified  by  AUTODIN  I  and 
the forthcoming AUTODIN II. The key feature of such data networks from our
model  viewpoint  is  the  ability  to  dynamically  restore  the  information  which 
has  failed.  One  also  notes  (from  Figure  1)  that  although  ITSTEC  with  its 
higher  assumed  mean  restoral  rate  (ML  =  20/hr)  was  a  decided  improvement  over 
the  current  system  (ML  =  2/hour),  effectiveness  will  be  severely  limited  by 
the  short  term  reliability. 

Therefore,  in  our  search  for  more  endurable  systems,  we  next  consider  the 
performance  of  systems  which  can  dynamically  repair  during  the  information 
transfer interval.

In  the  section  which  follows,  we  will  show  that  for  systems  employing 
dynamic  repair,  the  joint  probability  that  the  information  transfer  is 
completed within a delay < or = D, and that the path stays reliable over the
information  transfer  duration,  approaches  1  as  delay  D  approaches  infinity. 

In  the  circuit  switched  case  previously  considered,  the  corresponding 
probability  that  the  path  stayed  reliable  as  averaged  over  the  assumed 
distribution  of  information  transfer  lengths  was  M/(M+FS+FL).  It  is  only  fair 
to  point  out  that  in  the  circuit  switched  case,  we  have  implicitly  assumed 
impatient  subscribers  who  give  up  in  the  event  of  an  information  transfer 
failure.  Conversely,  if  one  assumed  patient  subscribers,  who  are  willing  to 
keep  repeating  their  information  transfer  attempts  whenever  failures  have 
occurred,  they  too  would  ultimately  get  through. 

Derivation of the case which is comparable to the circuit switched case
considered,  i.e.,  both  short  and  long  term  reliability  and  a  1  minute  delay 
before  repairs  begin,  requires  conversion  to  Z  transforms  before  transform 
inversion  and  is  not  presented  in  this  paper.  However,  we  present  two 
mathematically  simpler  cases  where  closed  form  analytical  solutions  can  be 
readily  obtained. 

First,  we  consider  the  case  of  constant  information  transfer  duration  T. 

A  successful  event  occurs  when  the  path  is  available,  or  if  not  available  it 
is  repaired  and  the  duration  is  T  and  the  path  is  reliable  (over  duration  T); 
or  if  not  reliable  the  path  is  repaired  and  stays  reliable;  or  if  it  doesn't 
stay  reliable,  it  is  repaired  and  stays  reliable;  or  if  .  .  .  and  so  on  ad 
infinitum.  Symbolically,  this  event  can  be  represented  as 


$$(a + \bar{a}\,re)\,T\,(r + \bar{r}\,re\,(r + \bar{r}\,re\,(r + \bar{r}\,re\,(\cdots))))$$

which equals

$$= (a + \bar{a}\,re)\,T\,\big(1 + \bar{r}\,re + (\bar{r}\,re)^2 + (\bar{r}\,re)^3 + \cdots\big)\,r$$

$$= (a + \bar{a}\,re)\,T\,r/(1 - \bar{r}\,re)$$

where, as before, a and ā are, respectively, the events that the path is
available and not available, re is the event that the path is
repaired/restored given it was in a failure mode, and r and r̄ are the
respective events that the path stays reliable and not reliable over interval
T. The probability that the delay to completion is exactly t seconds and the
information transfer stays reliable over T seconds will be, assuming
independence,


$$A\,R\,\big(P(T) + \bar{R}\,P(T)*Re + \bar{R}^2\,P(T)*Re*Re + \bar{R}^3\,P(T)*Re*Re*Re + \cdots\big)$$

$$+\,(1 - A)\,R\,\big(Re*P(T) + \bar{R}\,Re*P(T)*Re + \bar{R}^2\,Re*P(T)*Re*Re + \cdots\big)$$


where the * is the convolution process involving only those probability
elements introducing time delay. To be consistent with previous notation, let
AS = P(a), ĀS = 1-AS = P(ā), P(T) = δ(t-T), where δ(t-T) = 1 if t = T, δ(t-T)
= 0 if t ≠ T, and Re = P(re) = MS exp(-MSt) is the probability that the path is
repaired/restored in exactly t seconds. Finally, RS = P(r) and (1-RS) = P(r̄).

Following  the  basic  theory  discussed  in  reference  [20],  if  one  Laplace 
transforms  only  those  probability  elements  which  contribute  (additively)  to 
the  time  delay,  i.e.,  those  which  are  convolved,  the  product  of  the  Laplace 
transforms  when  inverse  transformed  will  produce  the  probability  density 
function of the sum of the time delays. Furthermore, to obtain the joint
probability that the total delay is < or = D and that the path stays reliable
over interval T, one need only multiply the product of the Laplace transform
of the probability density function and reliability by 1/s and then invert the
Laplace transform, for multiplication of a Laplace transform by 1/s, followed
by inversion, will produce the integral from 0 to t=D of the original function.

We now carry out the process step by step, first deriving the joint
probability that the delay = t and that it stays reliable over duration T, and
then deriving the product of the cumulative distribution function and R(T).
Using L[P(T)] = L[δ(t-T)] = exp(-sT) and L[Re] = L[MS exp(-MSt)] = MS/(MS+s) in
the closed form event probability equation


$$
\frac{[P(a) + P(\bar{a})\,P(re)]\;P(T)\;P(r)}{1 - P(\bar{r})\,P(re)}
= \frac{[\mathrm{AS} + (1-\mathrm{AS})\,\mathrm{Re}]\;P(T)\;R}{1 - (1-R)\,\mathrm{Re}}
$$
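produces

$$
L[\Pr[\text{delay}=t]] \times R = \frac{\left[\mathrm{AS} + (1-\mathrm{AS})\,\dfrac{\mathrm{MS}}{\mathrm{MS}+s}\right] e^{-sT}\,R}{1 - (1-R)\,\dfrac{\mathrm{MS}}{\mathrm{MS}+s}}
$$

which after simplification produces:

$$
L[\Pr[\text{delay}=t]] \times R = \frac{\mathrm{AS}\,s + \mathrm{MS}}{s + \mathrm{MS}\,R}\; e^{-sT}\,R.
$$

Writing the term AS s/(s + MS R) as AS - AS MS R/(s + MS R) produces

$$
L[\Pr[\text{delay}=t]] \times R = \left[\mathrm{AS} + \frac{\mathrm{MS}\,(1-\mathrm{AS}\,R)}{s + \mathrm{MS}\,R}\right] e^{-sT}\,R.
$$

Inverting the above transform produces

$$
\Pr[\text{delay}=t] \times R = \mathrm{AS}\,R\,\delta(t-T) + \mathrm{MS}\,R\,(1-\mathrm{AS}\,R)\; e^{-\mathrm{MS}\,R\,(t-T)}\; U(t-T).
$$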

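One also notes for future use that

$$
L[\Pr[\text{delay}=t]] \times R = \mathrm{AS}\,R\; e^{-sT} \left[1 + \frac{\mathrm{MS}\,(1-\mathrm{AS}\,R)}{\mathrm{AS}\,(s + \mathrm{MS}\,R)}\right] \tag{6}
$$

$$
L[\Pr[\text{delay} \le t]] \times R = \frac{1}{s}\,\mathrm{AS}\,R\; e^{-sT} \left[1 + \frac{\mathrm{MS}\,(1-\mathrm{AS}\,R)}{\mathrm{AS}\,(s + \mathrm{MS}\,R)}\right]
$$

which after inversion produces the result of interest:

$$
\Pr[\text{delay} \le t] \times R = U(t-T)\left[1 - (1-\mathrm{AS}\,R)\; e^{-\mathrm{MS}\,R\,(t-T)}\right]. \tag{7}
$$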

Observe that as t → ∞, Pr[delay ≤ ∞] x R = limit as t → ∞ of
U(t-T); i.e., the information is ultimately reliably transferred with
probability = 1. Also, using R = exp(-FS T) where FS is the average short term
failure rate, and recalling that AS = 1/(1 + FS/MS) and at FS=0, AS=1, R=1,
results in Pr[delay ≤ t] x R = U(t-T). In other words, perfect
performance is achieved at AS=1, R=1, the only delay being that due to the
message interval T.
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For  purposes  of  comparison,  a  system  without  dynamic  repair  during  the 
information  transfer  interval  would  have 

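$$
L[\Pr[\text{delay} \le t]] \times \mathrm{RS} = \frac{1}{s}\left[\mathrm{AS} + (1-\mathrm{AS})\,\frac{\mathrm{MS}}{s + \mathrm{MS}}\right] e^{-sT}\,\mathrm{RS}
$$

which upon inversion yields

$$
\Pr[\text{delay} \le t] \times \mathrm{RS} = U(t-T)\left[1 - (1-\mathrm{AS})\; e^{-\mathrm{MS}\,(t-T)}\right]\mathrm{RS}. \tag{8}
$$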


Equation (8) also produces Pr[delay ≤ t] x RS = U(t-T) at AS=1, RS=1,
as did Equation (7). However, as t approaches ∞, Pr[delay ≤ t] x RS
approaches RS for equation (8).
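The contrast between equations (7) and (8) can be checked numerically. The following is a minimal sketch, assuming the two closed forms exactly as written above; the function names and the parameter values are illustrative choices, not figures from this report.

```python
import math

def ps_dynamic_repair(t, T, AS, RS, MS):
    """Equation (7): Pr[delay <= t] x RS with dynamic repair via repeat."""
    if t < T:                      # U(t-T) = 0 before the message interval ends
        return 0.0
    return 1.0 - (1.0 - AS * RS) * math.exp(-MS * RS * (t - T))

def ps_no_dynamic_repair(t, T, AS, RS, MS):
    """Equation (8): Pr[delay <= t] x RS without dynamic repair."""
    if t < T:
        return 0.0
    return (1.0 - (1.0 - AS) * math.exp(-MS * (t - T))) * RS

# Hypothetical parameter values chosen only to make the asymptotes visible.
T, AS, RS, MS = 360.0, 0.95, 0.60, 0.5
for t in (360.0, 365.0, 400.0, 1000.0):
    print(t, ps_dynamic_repair(t, T, AS, RS, MS),
          ps_no_dynamic_repair(t, T, AS, RS, MS))
```

With these inputs the first column of results climbs toward 1 while the second levels off at RS, which is the behavior described for the two equations.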


Using equation (7), one can express the time behavior of the dynamic
repair case as

$$
t - T = -\frac{1}{\mathrm{MS}\,\mathrm{RS}}\,\ln\!\left[\frac{1 - \Pr[\text{delay} \le t] \times \mathrm{RS}}{1 - \mathrm{AS}\,\mathrm{RS}}\right] \tag{9}
$$

where t ≥ T and 1 - Pr[delay ≤ t] x RS ≤ 1 - AS RS.
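Equation (9) is simply equation (7) solved for the excess delay t-T. A short sketch of that inversion follows, with a round-trip check against equation (7); the function names and the sample values are assumptions made for illustration.

```python
import math

def ps_dynamic_repair(excess, AS, RS, MS):
    """Equation (7) written in terms of the excess delay t-T >= 0."""
    return 1.0 - (1.0 - AS * RS) * math.exp(-MS * RS * excess)

def excess_delay(ps, AS, RS, MS):
    """Equation (9): excess delay t-T at which Pr[delay <= t] x RS equals ps."""
    if not (AS * RS <= ps < 1.0):
        # Equation (9) requires 1 - ps <= 1 - AS*RS, and ps = 1 is only reached as t -> infinity.
        raise ValueError("ps must satisfy AS*RS <= ps < 1")
    return -math.log((1.0 - ps) / (1.0 - AS * RS)) / (MS * RS)

# Hypothetical values: evaluate (7) at t-T = 12.5 s, then recover 12.5 s from (9).
AS, RS, MS = 0.95, 0.60, 0.5
ps = ps_dynamic_repair(12.5, AS, RS, MS)
print(ps, excess_delay(ps, AS, RS, MS))
```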

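It will be easier to study the tradeoffs on t-T if we express AS, RS and
MS, which are not independent, in a more fundamental way. Recall, from
equation (4) and subsequent developments, that

$$
\sum_{j=0}^{(\mathrm{PU})Bt^*-1} \frac{e^{-PBt^*}\,(PBt^*)^j}{j!} = e^{-\mathrm{FS}\,t^*} = Q(\chi^2|V) \tag{10a}
$$

$$
\mathrm{RS} = e^{-\mathrm{FS}\,T} = e^{-\mathrm{FS}\,t^*\,(T/t^*)} = \left[Q(\chi^2|V)\right]^{T/t^*} \tag{10b}
$$

$$
e^{-\mathrm{MS}\,t^*} + e^{-\mathrm{FS}\,t^*} = 1 \tag{10c}
$$

$$
\mathrm{MS} = -\frac{1}{t^*}\,\ln\!\left[P(\chi^2|V)\right] \tag{10d}
$$

$$
\mathrm{AS} = \frac{1}{\mathrm{FS}/\mathrm{MS} + 1} \tag{10e}
$$

$$
Q(\chi^2|V) = 1 - P(\chi^2|V) \tag{10f}
$$

$$
\chi^2 = 2\,P\,B\,t^* \tag{10g}
$$

$$
V = 2\,(\mathrm{PU})\,B\,t^* \tag{10h}
$$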
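Relations (10a) through (10h) can be evaluated directly from the Poisson sum in (10a) without chi-squared tables. The sketch below assumes that form; the function name and the argument `unacceptable_count`, standing for the integer (PU)Bt*, are hypothetical names introduced here.

```python
import math

def short_term_parameters(P, B, t_star, unacceptable_count, T):
    """Return (FS, MS, AS, RS) from equations (10a)-(10e).

    unacceptable_count is the integer (PU)*B*t*: this many or more bit
    errors within one measurement interval t* counts as a failure.
    """
    mean_errors = P * B * t_star                     # chi^2 / 2, eq. (10g)
    # Eq. (10a): Q = probability of fewer than unacceptable_count errors in t*.
    Q = sum(math.exp(-mean_errors) * mean_errors ** j / math.factorial(j)
            for j in range(unacceptable_count))
    FS = -math.log(Q) / t_star                       # from exp(-FS t*) = Q
    MS = -math.log(1.0 - Q) / t_star                 # eq. (10d) with P(chi^2|V) = 1 - Q
    AS = 1.0 / (FS / MS + 1.0)                       # eq. (10e)
    RS = math.exp(-FS * T)                           # eq. (10b)
    return FS, MS, AS, RS

# Data case: P = 1e-5, B = 2400 b/s, one-bit interval, any bit error unacceptable.
print(short_term_parameters(P=1e-5, B=2400.0, t_star=1.0 / 2400.0,
                            unacceptable_count=1, T=360.0))
```

The direct sum is adequate for small thresholds; for large (PU)Bt* a log-domain evaluation would be a safer choice.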


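where P(χ²|V) is the cumulative distribution function of a chi squared
distribution with V degrees of freedom. Using these relationships, equation (9)
can be expressed as

$$
t - T = \frac{t^*}{\left[Q(\chi^2|V)\right]^{T/t^*}\,\ln\!\left[P(\chi^2|V)\right]}
\times \ln\!\left\{ \frac{\left(1 - \Pr[\text{delay} \le t] \times \mathrm{RS}\right)\,\ln\!\left[Q(\chi^2|V)\,P(\chi^2|V)\right]}{\ln\!\left[Q(\chi^2|V)\right] + \left[1 - \left(Q(\chi^2|V)\right)^{T/t^*}\right]\ln\!\left[P(\chi^2|V)\right]} \right\} \tag{11}
$$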


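By similar substitutions, one can also rewrite equation (8) as

$$
\Pr[\text{delay} \le t] \times \mathrm{RS} = U(t-T)\left\{ 1 - \frac{\ln Q(\chi^2|V)}{\ln\!\left[Q(\chi^2|V)\,P(\chi^2|V)\right]}\; \exp\!\left[\frac{(t-T)\,\ln P(\chi^2|V)}{t^*}\right] \right\} \left[Q(\chi^2|V)\right]^{T/t^*} \tag{12}
$$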


Using equations (11) and (12) and relating P(χ²|V) and Q(χ²|V) to the
measurement interval and error probability P, Tables II and III were
constructed to show how performance varies as a function of P, t* and system
category (dynamic repair using repeat vs no dynamic repair).


Table II, the data case, uses a constant information transfer duration of
T=360 seconds; 1 bit or more in error over the measurement interval is
unacceptable, i.e., (PU)Bt* - 1 = 0, B=2400 b/s, and t* = 1/B as well as
t* = 1 second. Performance for the dynamic repair case using repeat is shown
in terms of the value of t-T from equation (11), i.e., the delay due to path
outages and "repairs" in excess of the constant message interval T, at a
"probability of success" = .99. Here, "probability of success" is Probability
(delay ≤ t) x RS(T). For comparison, the corresponding "probability of success"
for the non-dynamic repair case is also shown as derived from equation (12).


TABLE II


DATA PERFORMANCE AS A FUNCTION OF ERROR PROBABILITY P

Parameters: T = 6 minute message duration; Bit rate B = 2400 b/s;
Measurement interval t* = 1/2400 sec (1 bit) or 0.8333 sec (2000 bits).


PS, probability of success, is probability [(delay to completed transfer
-T) ≤ delay, t] x R(T). Here R(T) is the reliability, or probability the
path does not fail over message interval T.


                          DELAY, t (secs) at
                          PS = .99, Dynamic         PS
   P          t* (secs)   Repair via Feedback       No Dynamic Repair

   10^-3      1/2400      4.7 x 10^371              6 x 10^-376
   10^-4      1/2400      6.9 x 10^33               3 x 10^-38
   10^-4.8    1/2400      153.7                     1 x 10^-6
   10^-4.9    1/2400      9.0                       1.9 x 10^-5
   10^-5      1/2400      .94                       1.8 x 10^-4
   10^-6      1/2400      2.9 x 10^-4               .42

   10^-3      .8333       4.5 x 10^376              6 x 10^-376
   10^-4      .8333       7.5 x 10^37               3 x 10^-38
   10^-5      .8333       5531.5                    1.8 x 10^-4
   10^-5.1    .8333       884.0                     1 x 10^-3
   10^-5.2    .8333       204.1                     4.3 x 10^-3
   10^-5.3    .8333       63.1                      1.3 x 10^-2
   10^-5.4    .8333       24.6                      3.2 x 10^-2
   10^-5.5    .8333       11.5                      6.5 x 10^-2
   10^-5.6    .8333       6.2                       .11
   10^-5.7    .8333       3.7                       .18
   10^-5.8    .8333       2.5                       .25
   10^-5.9    .8333       1.7                       .34
   10^-6      .8333       1.3                       .42
   10^-7      .8333       .23                       .92
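Individual entries of Table II can be recomputed from equations (9) and (10). The sketch below assumes the data case (any bit error in t* is a failure, so Q = exp(-PBt*)) and takes the no-dynamic-repair column as RS, which equation (8) approaches when AS is close to 1; the function name is an illustrative choice. Rows with P = 10^-3 are omitted because RS is then of order 10^-376, below the range of double precision floating point.

```python
import math

B = 2400.0         # bit rate, b/s
T = 360.0          # message duration, s
TARGET_PS = 0.99   # probability of success used for the delay column

def table_ii_row(P, t_star):
    """Return (excess delay t-T at PS = .99 with repeat, approximate PS without repair)."""
    Q = math.exp(-P * B * t_star)               # eq. (10a) with zero errors allowed
    FS = -math.log(Q) / t_star                  # short term failure rate
    MS = -math.log(1.0 - Q) / t_star            # eq. (10d)
    AS = MS / (MS + FS)                         # eq. (10e)
    RS = math.exp(-FS * T)                      # eq. (10b)
    delay = -math.log((1.0 - TARGET_PS) / (1.0 - AS * RS)) / (MS * RS)  # eq. (9)
    return delay, RS

for t_star in (1.0 / 2400.0, 2000.0 / 2400.0):
    for P in (1e-4, 1e-5, 1e-6):
        delay, rs = table_ii_row(P, t_star)
        print(f"P={P:g}  t*={t_star:.6f}  delay={delay:.4g}  PS(no repair)={rs:.2g}")
```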
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For  the  case  of  dynamic  repair  via  repeat,  the  delay  increases  sharply 
with increasing path error probability. A shorter measurement interval
appears to be less robust in that for a given increase in error probability
the increase in delay becomes greater. However, shortening the measurement
interval, t*, at a given error probability shortens the delay. For example,
Table II shows at P=10^-5, and Bt* = 1 bit (t* = 1/2400 seconds), a .94
second delay, while at Bt* = 2000 bits (t* = .8333 seconds) the
corresponding delay is 5531.5 seconds. This effect is primarily due to
changing the average path repair rate, MS. Recall that for the data
case, 1 bit in error per measurement interval results in an interval
reliability RS = [Q(χ²|V)]^(T/t*) = exp(-PBT), which is independent of t*.

However, MS = -(1/t*) ln(P(χ²|V)) = -(1/t*) ln(1 - exp(-PBt*)); thus
decreasing t* increases MS, thereby shortening the delay. Increasing the
bit rate while holding the number of bits per measurement interval constant is
another way of shortening the measurement interval.

For the data case without dynamic repair, the right hand side of equation
(8) approximately equals U(t-T)RS, where RS = exp(-PBT). As such, for T=360
seconds, B=2400 b/s, P = 1.16 x 10^-8 is required for PS=.99.
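As a worked check of the required error probability, setting RS = exp(-PBT) equal to the target PS and solving for P gives

$$
P = \frac{-\ln(\mathrm{PS})}{B\,T} = \frac{-\ln(0.99)}{(2400\ \text{b/s})(360\ \text{s})} = \frac{1.005\times10^{-2}}{8.64\times10^{5}} \approx 1.16\times10^{-8}.
$$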

Next, the voice case is considered. The delay and accuracy requirements
for voice differ from those of data. Voice telecommunications should emulate
natural conversation. This requires very short total delays, of the order of
a small fraction of a second, to avoid the adverse psychological effects of
long delays in replies. Conversely, data telecommunications do not have such
real time requirements. Accuracy requirements for voice are much less
stringent than for data; e.g., relative thresholds of acceptable error rates
are  of  the  order  of  10'"  for  voice  vs  10”9  to  lO'1^  for  computer  data. 

Table  III  essentially  shows,  for  the  case  of  dynamic  repair  using  repeat 
and  assuming  a  6  minute  call  duration,  how  the  delay  at  a  prespecified 
accuracy  level  varies,  as  the  error  probability  P  averaged  over  the  indicated 
measurement  interval  t*  changes.  Additionally,  it  shows  an  associated 
measure of accuracy performance for the case without dynamic repair at t* =
5 seconds, and the average number of "failures" per call duration at both t*
= 5 and t* = 1 second. A failure is defined to occur when the P averaged
over  t*  equals  or  exceeds  10'9.  The  tabulated  values  were  derived  using 
equations  (10),  (11),  and  (12). 
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VOICE  PERFORMANCE  AS  A  FUNCTION  OF  ERROR  PROBABILITY  P 


Choice of a 5 second measurement interval provides a point of reference
for voice, based upon the criterion a call fails only when the error
probability  P  equals  or  exceeds  10~^  over  a  5  second  interval.  As  such,  the 
failure  rate  is  no  higher  than  it  need  be  for  voice.  However,  the  time  to 
detect a failure (5 seconds) far exceeds the short delay requirement for
voice. In effect, the useful range of P values is essentially the same for
both the dynamic repair with repeat and no dynamic repair cases.


The 1 second measurement interval was chosen because it is close to the
packet interval for AUTODIN II data at 2400 b/s. The relative delay
performance for dynamic repair using repeat becomes even worse because
shortening the measurement interval increases the number of failures per
call. Note that using equations (10a), (10g) and (10h), the failure rate per
call of interval T, FS x T = -(T/t*) ln[Q(χ²|V)] with χ² = 2PBt* and
V = 2(PU)Bt*, increases for this case where the number of bit errors per
measurement interval, (PU)Bt*-1, exceeds zero. By contrast, recall that in
the data case, (PU)Bt* = 1, zero bit errors were allowed, and FS is
independent of t*.
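The dependence of FS x T on the measurement interval can be illustrated with the Poisson form of equation (10a). In the sketch below the threshold PU = 10^-2 and the error probability P = 5 x 10^-3 are hypothetical values chosen only to show the trend; they are not the report's voice parameters, and the function name is likewise an assumption.

```python
import math

def failures_per_call(P, PU, B, t_star, T):
    """FS x T = -(T/t*) ln Q(chi^2|V), with Q from the Poisson sum of eq. (10a)."""
    mean_errors = P * B * t_star               # chi^2 / 2, eq. (10g)
    threshold = round(PU * B * t_star)         # V / 2, eq. (10h): this many errors or more is a failure
    # Upper tail Pr[errors >= threshold] = 1 - Q, summed directly so that a
    # tiny tail is not lost to rounding when Q is very close to 1.
    tail = 0.0
    for j in range(threshold, threshold + 2000):
        tail += math.exp(j * math.log(mean_errors) - mean_errors - math.lgamma(j + 1))
    return -(T / t_star) * math.log1p(-tail)

B, T = 2400.0, 360.0
for t_star in (5.0, 1.0):
    # Hypothetical voice-like case: failures allowed only above a nonzero error count.
    print("t* =", t_star, "failures per call =", failures_per_call(5e-3, 1e-2, B, t_star, T))
```

When (PU)Bt* = 1, as in the data case, the same function returns PBT regardless of t*.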


One  major  problem  encountered  thus  far  in  using  dynamic  repair  with  repeat 
for  voice  is  the  delay  caused  by  the  long  measurement  interval.  As  a  logical 
next  step,  we  investigate  the  dynamic  repair  with  repeat  case  where  the 
measurement  interval,  say  100  bits  at  2400  b/s,  is  very  short.  We  also 
investigate  the  behavior  of  the  measure  of  effectiveness  for  the  repeat  case 
when  the  information  transfer  is  broken  into  n  shorter  packets.  This  requires 
a generalization of equation (7), which represents the case for n=1.

In Appendix A, we derive the following expression for an n packet
information transfer of duration T=nt*, under the assumption of independence
from packet to packet.


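$$
\Pr[\text{delay} \le t] \times \mathrm{RS}(nt^*) = (\mathrm{AS}\,\mathrm{RS})^n + \sum_{j=1}^{n} \binom{n}{j} (\mathrm{AS}\,\mathrm{RS})^{n-j} (1-\mathrm{AS}\,\mathrm{RS})^{j} \left(1 - \sum_{k=0}^{j-1} e^{-\mathrm{MS}\,\mathrm{RS}\,t}\,\frac{(\mathrm{MS}\,\mathrm{RS}\,t)^k}{k!}\right) \tag{13}
$$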

In equation (13), AS, RS, and MS are, respectively, the short term
availability, reliability, and average repair rate as measured over the packet
interval t*. The expression $\binom{n}{j}$ = n!/((n-j)! j!) is the number of
combinations of n packets taken j at a time, and represents the number of
mutually exclusive ways in which n-j packets are available and reliable, and j
packets are not available and reliable. The term

$$
1 - \sum_{k=0}^{j-1} e^{-\mathrm{MS}\,\mathrm{RS}\,t}\,\frac{(\mathrm{MS}\,\mathrm{RS}\,t)^k}{k!}
$$

is interpreted as the probability that j or greater events occur in interval
t. These events, of mean rate MS RS, consist of repair of the failed path and,
given the path is repaired, it stays reliable over the packet duration. One
also notes that this term is the cumulative distribution function of a chi
squared distribution, P(χ²|V), with χ² = 2 MS RS t and V = 2j degrees of
freedom. For n=1, equation (13) reduces to equation (7) (with T=0).
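Equation (13) is straightforward to evaluate for a modest number of packets. The following sketch assumes the formula as stated above, with t measured as the delay in excess of the message duration; the function name and test values are illustrative. The direct factorial terms limit it to small n, and a log-domain evaluation would be needed for cases such as n=256.

```python
import math

def ps_n_packets(t, n, AS, RS, MS):
    """Equation (13): Pr[delay <= t] x RS(n t*) for an n packet transfer, t >= 0."""
    ar = AS * RS                  # a packet is available and stays reliable
    rate = MS * RS                # mean rate of "repaired and then reliable" events
    total = ar ** n               # no packet needs repair
    for j in range(1, n + 1):
        # Probability that j or more such events occur within t.
        at_least_j = 1.0 - sum(math.exp(-rate * t) * (rate * t) ** k / math.factorial(k)
                               for k in range(j))
        total += math.comb(n, j) * ar ** (n - j) * (1.0 - ar) ** j * at_least_j
    return total

# Hypothetical parameters: the initial value is (AS RS)^n and the limit is 1.
for t in (0.0, 5.0, 20.0, 100.0):
    print(t, ps_n_packets(t, 4, 0.95, 0.60, 0.5))
```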

In  numerically  evaluating  equation  (13),  we  have  been  limited  to  a 


just  about  at  the  maximum  of  0.7327  x  10^6  allowed  by  the  IBM  370  employed 
in  the  calculation. 

Figure  2  shows  curves  of  equation  (13)  for  a  constant  message  length  of 
25,600 bits at an error rate P=10^-4, as one successively quadruples the
number  of  packets  per  message,  up  to  a  maximum  of  n=256  packets.  We  have 
uniformly  applied  the  criterion  that  a  packet  is  reliable  if  it  has  zero  bits 
in  error.  The  improvement  in  response  time  for  data  with  increasing  the 
number  of  packets  per  message,  or  equivalently  shortening  the  packet  length, 
is  striking. 

Although not shown in Figure 2, the corresponding graph for voice
without dynamic repair at P=10^-4, B=2400 b/s and T=(25600/2400) s can be
computed using an appropriate transformation of equation (8) as

$$
\Pr[\text{delay} - T \le t] \times \mathrm{RS}(T) = U(t)\left[1 - (1-\mathrm{AS})\,e^{-\mathrm{MS}\,t}\right] e^{-\mathrm{FS}\,T}.
$$

Using the corresponding values of MS, FS from Table I results in

$$
\Pr[\text{delay} - T \le t] \times \mathrm{RS}(T) = U(t)\left[1 - 1.76\times10^{-84}\right],
$$

which is very close to ideal performance (value U(t)) and for practical
purposes is still better than employing dynamic repair.
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IV.  SIGNIFICANT  FINDINGS  AND  CONCLUSIONS 


A measure of effectiveness (M.O.E.) was formulated in terms of the
product of the probability (delay ≤ t) x probability (the information is
reliably transferred over its interval) given an environment/stress level.
Partitioning into long and short term parameters allowed accounting for
quality/accuracy effects in terms of short term parameters, using a Poisson
model, which were related to average error probability, bit rate, unacceptable
error probability threshold and measurement interval.

The M.O.E. was evaluated, omitting queueing delays, for two basic cases:
(1) no dynamic repairs (during the information transfer interval), and (2)
dynamic repair via repeat. For the no dynamic repair case, with an
exponentially distributed information transfer interval of average duration
1/M, the reliability (part of the M.O.E.) is asymptotic to M/(M+FL+FS) where
FL and FS are the mean long and short term failure rates. Such reliability
effects are illustrated by the relative robustness of higher bit rates for
voice and by poor data performance. ITSTEC is shown to be a substantial
improvement, but will still be severely limited by short term failure rates.

In the dynamic repair via repeat case, the M.O.E. was derived for a constant
information transfer duration with short failures and repairs. As the delay
approaches infinity, Pr[delay ≤ t] x R approaches 1 with dynamic repair, and
R without it. Performance variation as a function of error probability and
measurement interval is studied for data and voice, at 2.4 kb/s, based on a
random channel model. Dynamic repair via repeat functions well for data, but
delays inhibit its utility for voice.

Finally, assuming short term Poisson failures and exponentially
distributed repair times, and an n packet message, an expression for the
M.O.E. (Pr[delay ≤ t] x R) is derived and physically interpreted. Response
time improvements of packetizing data are shown.

In  view  of  the  poor  reliability  of  no  dynamic  repair,  and  the  intolerable 
delay  of  dynamic  repair  via  repeat  for  voice,  dynamic  repair  via  fast 
switching  deserves  study  and  consideration. 
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Glossary  of  Terms 


a       event that the path is available (in a non-failed state) at the
        instant of demand t.

$\bar{a}$       event that the path is not available (in a failed state) at the
        instant of demand t (complement of a).

aL      event that the path is long term available at the instant of demand
        t; i.e., counting only failures > 1 minute duration, it is in a
        non-failed state.

$\overline{aL}$      event that the path is not long term available; i.e., counting only
        failures > 1 minute duration, it is in a failed state.

aS      event that the path is short term available, counting only failures
        where the measurement interval is ≤ 1 minute duration, at the
        instant of demand t given event aL.

$\overline{aS}$      event that the path is not short term available at the instant of
        demand given event aL, counting only failures involving a
        measurement interval ≤ 1 minute.

AL      long term availability, the probability of event aL, estimated as
        1 - (sum total of outages > 1 minute duration divided by the total
        measurement interval). AL = ML/(ML+FL).

ARQ     Automatic Repeat Request.

AS      short term availability: the probability of event aS, estimated as
        1 - (sum total of outages involving ≤ 1 minute measurement
        durations divided by a total measurement interval which excludes
        outages whose duration are < 1 minute). AS = MS/(MS+FS).

B       bit rate, in bits/time.

CDF     Cumulative Distribution Function: the probability that a random
        variable is ≤ some specified value.

d       time delay, used as a random variable.

D       time delay, used as a specified value.

E_i     Environment/stress level i.

F       Conditional average path failure rate, given the path was available
        at the start of any measurement interval. Estimated as 1/MTTF.
        F = FL+FS.

FL      Conditional average path long term failure rate, given the path was
        long term available at the start of any such event and counting as
        failures only those outages > 1 minute. Estimated as the reciprocal
        of the mean conditional time to a long term failure.

FS      Conditional average short term failure rate, given the path was long
        term available at the start of any such event and counting as
        failures only outages involving measurement intervals ≤ 1
        minute.

i.t.    information transfer

ITSTEC  Integrated Switching and Tech Control

ln      natural logarithm

L       long ( > 1 minute)

M       average rate of i.t. duration, or 1/average message length.

ML      average rate of long term path restoral, i.e., reciprocal of the
        average time to repair/restore, including only outages > 1 minute.

M.O.E.  Measure of Effectiveness

MS      Conditional average rate of short term path restoral, given event
        aL, i.e., reciprocal of the average time to repair/restore,
        including only restoral times associated with measurement
        intervals ≤ 1 minute.

MTTF    Mean time to failure, measured from a time when the path was in a
        non-failed state.

MTTR    Mean time to repair/restore, measured from the time of path failure.

n       number of packets.

P       probability of a bit error

pdf     probability density function

Pr      probability

PU  Unacceptable  bit  error  probability  threshold,  e.g.,  10"^  for  voice. 
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P(X1) 


        Cumulative distribution function (CDF) of the normal distribution,
        mean 0, variance 1, at specified value X1. As used in this report
        it is also the short term probability of unacceptable reception.

Q(X1)   1 - P(X1)

Q(χ²|V) 1 - cumulative distribution function (CDF) of a chi-squared
        distribution with V degrees of freedom.

r       Conditional event that the path stays reliable (does not fail) over a
        specified interval, given it was available at the start of the
        interval.

$\bar{r}$       Conditional event that the path does not stay reliable (does fail)
        over a specified interval, given it was available at the start of
        the interval.

re      Conditional event that the path is repaired/restored, given it was
        in a failed state.

reL     Conditional event that the path is repaired/restored, given it was
        in a failed state lasting > 1 minute.

reS     Conditional event that the path is repaired/restored, given it was
        in a failed state lasting ≤ 1 minute.

rL      Conditional event that the path stays long term reliable (does not
        fail, including only outages > 1 minute) over a specified interval,
        given it was long term available at the start of the interval.

rS      Conditional event that the path stays short term reliable (does not
        fail, counting only failures with measurement intervals ≤ 1
        minute duration) over a specified interval, given it was available
        at the start of the interval.

R       Reliability, the probability of conditional event r.

$\bar{R}$       1-R

s       Laplace transform variable.

S       short term, i.e., ≤ 1 minute.

t       time

T       duration of information transfer interval.

U(t)    Unit step function, value 0 for t<0, 1 for t ≥ 0.

W       Waiting time due to queueing.

δ(t)    Unit impulse function, value 0 for t ≠ 0, 1 for t=0.
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APPENDIX  A 


DERIVATION  OF  THE  CDF  OF  DELAY  X  RS  FOR  AN  N  PACKET  CASE 


From  equation  (6)  in  the  main  body  of  the  report,  the  Laplace 
transform  (of  probability  density  function  of  delay  due  to  repairs)  x  the 
reliability  over  a  packet  interval  t*  is 


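$$
L[\Pr[\text{delay}=t]] \times \mathrm{RS}(t^*) = \mathrm{AS}\,\mathrm{RS}(t^*)\left[1 + \frac{\left(1-\mathrm{AS}\,\mathrm{RS}(t^*)\right)\mathrm{MS}}{\mathrm{AS}\left(s + \mathrm{MS}\,\mathrm{RS}(t^*)\right)}\right] \tag{A1}
$$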


for an information transfer interval T=nt* consisting of n packets. If the
probabilities are independent from packet to packet, the corresponding result
is, using RS(t*)^n = RS(nt*),

$$
L[\Pr[\text{delay}=t]] \times \mathrm{RS}(nt^*) = \left[\mathrm{AS}\,\mathrm{RS}(t^*)\right]^n \left[1 + \frac{\left(1-\mathrm{AS}\,\mathrm{RS}(t^*)\right)\mathrm{MS}}{\mathrm{AS}\left(s + \mathrm{MS}\,\mathrm{RS}(t^*)\right)}\right]^n \tag{A2}
$$

Equation (A2) uses the fact that the density function of the sum of n
independent, identically distributed delays is their n fold convolution in the
time domain, or equivalently their n fold product in the Laplace transform
domain. Expanding (A2) via the Binomial theorem, and for simplicity of
notation, using AS=A, RS(t*)=R, MS=M, and 1 - AS RS = $\overline{AR}$ produces


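$$
L[\Pr[\text{delay}=t]] \times R(nt^*) = (AR)^n \left[ 1 + \binom{n}{1} \frac{\overline{AR}\,M}{A}\,\frac{1}{s+MR} + \cdots + \binom{n}{j} \left(\frac{\overline{AR}\,M}{A}\right)^{j} \frac{1}{(s+MR)^{j}} + \cdots + \binom{n}{n} \left(\frac{\overline{AR}\,M}{A}\right)^{n} \frac{1}{(s+MR)^{n}} \right] \tag{A3}
$$

where $\binom{n}{j}$ = n!/[(n-j)! j!] and $\overline{AR}$ = 1 - AR.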


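Multiplication of the right side of (A3) by 1/s is equivalent to integration
from 0 to t in the time domain. Hence, the contribution to the
L[Pr[delay ≤ t]] x R(nt*) of the j-th term from (A3) will be

$$
(AR)^n \binom{n}{j} \left(\frac{\overline{AR}\,M}{A}\right)^{j} \frac{1}{s\,(s+MR)^{j}}.
$$

Expanding this term in terms of a partial fraction expansion produces

$$
(AR)^n \binom{n}{j} \frac{\left(\overline{AR}\,M/A\right)^{j}}{(MR)^{j}} \left[ \frac{1}{s} - \frac{1}{s+MR} - \frac{MR}{(s+MR)^{2}} - \cdots - \frac{(MR)^{j-1}}{(s+MR)^{j}} \right].
$$

Inverting the above transform to the time domain yields

$$
(AR)^n \binom{n}{j} \left(\frac{\overline{AR}}{AR}\right)^{j} \left[ U(t) - e^{-MRt} - \frac{(MRt)\,e^{-MRt}}{1!} - \cdots - \frac{(MRt)^{j-1}\,e^{-MRt}}{(j-1)!} \right]
$$

which is of the form

$$
(AR)^n \binom{n}{j} \left(\frac{\overline{AR}}{AR}\right)^{j} \left[ 1 - \sum_{k=0}^{j-1} e^{-MRt}\,\frac{(MRt)^k}{k!} \right].
$$

Therefore,

$$
\Pr[\text{delay} \le t] \times R(nt^*) = (AR)^n + \sum_{j=1}^{n} \binom{n}{j} (AR)^{n-j} (1-AR)^{j} \left( 1 - \sum_{k=0}^{j-1} e^{-MRt}\,\frac{(MRt)^k}{k!} \right). \tag{A4}
$$

One notes for n=1, (A4) reduces to equation (7) with T=0. For n=2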


$$
\Pr[\text{delay} \le t] \times R(nt^*) = (AR)^2 + 2AR\,(1-AR)\left(1-e^{-MRt}\right) + (1-AR)^2 \left(1 - e^{-MRt}\,(1+MRt)\right). \tag{A5}
$$

The  (AR)2  term,  which  is  also  the  initial  value  (t=0),  represents  the 
probability  both  packets  are  available  and  stay  reliable  over  their  respective 
intervals,  t*.  The  second  term  represents  the  two  possible  events  that  the 
first  (or  second)  packet  is  available  and  stays  reliable  (AR),  and  that  the 
second  (or  first)  packet  is  not  available  and  reliable  (1-AR)  and  is  repaired 
within time t and stays reliable (1 - exp(-MRt)). The third term represents the
event  where  both  packets  are  not  available  and  reliable  and  are  repaired 
within  interval  t  and  stay  reliable. 
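The physical interpretation of (A5) can be cross-checked by simulation. The sketch below is a Monte Carlo illustration added here, not the report's method: each packet independently incurs no repair delay with probability AR and otherwise an exponentially distributed delay of rate MR, as implied by the per-packet transform in (A1). The function names, trial count, seed, and parameter values are assumptions.

```python
import math
import random

def simulate_two_packets(t, AR, MR, trials=200_000, seed=1):
    """Estimate Pr[total repair delay <= t] x R for n=2 by direct simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        delay = 0.0
        for _packet in range(2):
            if rng.random() >= AR:             # packet not available and reliable
                delay += rng.expovariate(MR)   # wait for a repair that then stays reliable
        if delay <= t:
            hits += 1
    return hits / trials

def closed_form_a5(t, AR, MR):
    """Equation (A5) with AR = A*R and MR = M*R."""
    e = math.exp(-MR * t)
    return AR ** 2 + 2 * AR * (1 - AR) * (1 - e) + (1 - AR) ** 2 * (1 - e * (1 + MR * t))

# Hypothetical values: AR = 0.57, MR = 0.3 per second, evaluated at t = 2 s.
print(simulate_two_packets(2.0, 0.57, 0.3), closed_form_a5(2.0, 0.57, 0.3))
```

The two printed values should agree to within the sampling error of the simulation.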
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